Looking at word meaning. An interactive visualization of Semantic Vector Spaces for Dutch synsets
Authors
Abstract
In statistical NLP, Semantic Vector Spaces (SVS) are the standard technique for the automatic modeling of lexical semantics. However, it is largely unclear how these black-box techniques exactly capture word meaning. To explore the way an SVS structures the individual occurrences of words, we use a non-parametric MDS solution of a token-by-token similarity matrix. The MDS solution is visualized in an interactive plot with the Google Chart Tools. As a case study, we look at the occurrences of 476 Dutch nouns grouped in 214 synsets.
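To make the input to the MDS step concrete, the following minimal sketch builds a token-by-token similarity matrix and converts it into dissimilarities. It is an illustration under stated assumptions, not the authors' implementation: the toy context vectors, the use of cosine similarity, and the function names are all hypothetical choices, since the abstract does not specify them.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length context vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention for empty contexts (an assumption)
    return dot / (norm_u * norm_v)

def similarity_matrix(tokens):
    """Token-by-token matrix: entry [i][j] compares occurrence i with occurrence j."""
    n = len(tokens)
    return [[cosine(tokens[i], tokens[j]) for j in range(n)] for i in range(n)]

def to_dissimilarity(sim):
    """Turn similarities into dissimilarities (1 - similarity)."""
    return [[1.0 - s for s in row] for row in sim]

# Hypothetical toy data: one context vector per word occurrence (token).
token_vectors = [
    [2.0, 0.0, 1.0],  # occurrence 1
    [1.0, 0.0, 1.0],  # occurrence 2, similar context to occurrence 1
    [0.0, 3.0, 0.0],  # occurrence 3, unrelated context
]

sim = similarity_matrix(token_vectors)
dis = to_dissimilarity(sim)
```

A non-parametric (non-metric) MDS routine would then take `dis` as input and place the tokens in two dimensions so that the rank order of the dissimilarities is preserved as closely as possible; the resulting coordinates are what an interactive scatter plot would display.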
Similar resources
Automatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...
Dimensionality Reduction in Semantic Vector Spaces Using a Derivational Resource
Lexical semantic vector spaces model the meaning of words by representing the co-occurrence statistics of words in certain contexts. Words are considered similar if they occur in similar contexts, so that word similarity can be predicted by comparing their representations in the semantic vector space. Two major problems in those vector spaces are their size and their sparsity. Due to the charact...
Filaments of Meaning in Word Space
Word space models, in the sense of vector space models built on distributional data taken from texts, are used to model semantic relations between words. We argue that the high dimensionality of typical vector space models leads to unintuitive effects on modeling likeness of meaning and that the local structure of word spaces is where interesting semantic relations reside. We show that the local...
Chinese-English Bilingual Word Semantic Similarity Based on Chinese WordNet
Semantic similarity measurement of multilingual words is a challenging problem in data mining, information extraction, information retrieval, etc. This paper introduces an algorithm to measure the semantic similarity of Chinese-English bilingual words based on Chinese WordNet, an expansion of WordNet in Simplified Chinese. The algorithm not only measures the semantic similarity for Chinese and ...
Visual Exploration of Word Vector Embeddings
The use of word vector embeddings as the basis for many upstream tasks in text processing has led to large improvements in accuracy. However, the exact reasons for this success largely remain unclear, as the properties and relations that these embeddings encode are often not well understood. Our goal in this ongoing project is to design effective interactive visualizations that help practition...